Longer RNNs - Project Report
Authors
Abstract
Training Recurrent Neural Networks (RNNs) on problems with long-range dependencies is limited by vanishing and exploding gradients. In this work, we introduce a new training method and a new recurrent architecture using residual connections, both aimed at increasing the range of dependencies that RNNs can model. We demonstrate the effectiveness of our approach by showing faster convergence on two toy tasks involving long-range temporal dependencies and improved performance on a character-level language modeling task. Further, we show visualizations that highlight the improvements in gradient propagation through the network.
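The abstract does not spell out the exact residual formulation, so the following is only a minimal sketch of one plausible residual recurrent update; the class name `ResidualRNNCell` and the additive form h_t = h_{t-1} + tanh(.) are illustrative assumptions, not the report's exact architecture.

```python
import torch
import torch.nn as nn

class ResidualRNNCell(nn.Module):
    """Vanilla RNN cell with an additive skip connection on the hidden state.

    The update h_t = h_{t-1} + tanh(W_x x_t + W_h h_{t-1} + b) keeps an
    identity path from h_{t-1} to h_t, so gradients can flow across many
    time steps without being repeatedly squashed by the nonlinearity.
    """

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.in2hid = nn.Linear(input_size, hidden_size)
        self.hid2hid = nn.Linear(hidden_size, hidden_size)

    def forward(self, x_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        # Residual update: identity path plus a learned delta.
        return h_prev + torch.tanh(self.in2hid(x_t) + self.hid2hid(h_prev))
```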
Similar resources
Learning Longer-term Dependencies in RNNs with Auxiliary Losses
We present a simple method to improve learning of long-term dependencies in recurrent neural networks (RNNs) by introducing unsupervised auxiliary losses. These auxiliary losses force RNNs to either remember the distant past or predict the future, enabling truncated backpropagation through time (BPTT) to work on very long sequences. We experimented on sequences up to 16,000 tokens long and report faster t...
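As a rough illustration of the idea (not this paper's exact setup, which also uses a reconstruction variant with a separate decoder), the sketch below adds a next-token prediction loss on a randomly sampled window of hidden states, so gradient signal reaches positions that truncated BPTT would otherwise leave untrained; the function name and the `window` parameter are hypothetical.

```python
import torch
import torch.nn.functional as F

def auxiliary_prediction_loss(hidden: torch.Tensor,   # (T, B, H) RNN states
                              targets: torch.Tensor,  # (T, B) token ids
                              head: torch.nn.Linear,  # maps H -> vocab size
                              window: int = 50) -> torch.Tensor:
    """Unsupervised auxiliary loss over a randomly sampled short window.

    Predicting the next `window` tokens from states in the middle of a long
    sequence gives truncated BPTT a learning signal at positions far from
    the final supervised target.
    """
    T = hidden.size(0)
    start = torch.randint(0, T - window, (1,)).item()
    logits = head(hidden[start : start + window])       # (window, B, V)
    gold = targets[start + 1 : start + window + 1]      # next-step tokens
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           gold.reshape(-1))
```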
Regularizing RNNs by Stabilizing Activations
We stabilize the activations of Recurrent Neural Networks (RNNs) by penalizing the squared distance between successive hidden states’ norms. This penalty term is an effective regularizer for RNNs including LSTMs and IRNNs, improving performance on character-level language modelling and phoneme recognition, and outperforming weight noise and dropout. We achieve state-of-the-art performance (17.5...
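The penalty described here has a direct form, beta * mean_t (||h_t|| - ||h_{t-1}||)^2, which can be sketched in a few lines; the default `beta` below is an arbitrary placeholder, not the paper's tuned value.

```python
import torch

def norm_stabilizer(hidden: torch.Tensor, beta: float = 50.0) -> torch.Tensor:
    """Norm-stabilizer penalty: beta * mean_t (||h_t|| - ||h_{t-1}||)^2.

    `hidden` holds the hidden states as a (T, B, H) tensor. Penalizing changes
    in the norm (rather than in the state itself) discourages activations from
    exploding or decaying over time without forcing successive states to match.
    """
    norms = hidden.norm(dim=-1)              # (T, B) per-step L2 norms
    return beta * (norms[1:] - norms[:-1]).pow(2).mean()
```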
Connectionist Temporal Classification: Labelling Unsegmented Sequences with Recurrent Neural Networks Research Project Report – Probabilistic Graphical Models course
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, they cannot be applied directly. Connect...
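For context, modern frameworks ship this loss directly; the snippet below shows PyTorch's built-in `nn.CTCLoss` marginalizing over all alignments for a batch of unsegmented targets. The shapes and sizes are illustrative, and this is the standard library API rather than the report's own implementation.

```python
import torch
import torch.nn as nn

# Minimal CTC usage sketch: align frame-level RNN outputs with
# unsegmented label sequences (all sizes are made up for the example).
T, N, C = 100, 4, 28        # input frames, batch size, classes incl. blank=0
log_probs = torch.randn(T, N, C).log_softmax(dim=-1)
targets = torch.randint(1, C, (N, 15))     # label ids; 0 is reserved for blank
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 15, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)   # sums over all alignments via dynamic programming
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```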
Recursive Nested Neural Network for Sentiment Analysis
Early sentiment prediction systems used semantic vector representations of words to express longer phrases and sentences. These methods proved to have poor performance, since they do not consider compositionality in language. Recently, many richer models have been proposed to capture compositionality in natural language for better sentiment prediction. Most of these algorithms are...
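The entry does not give the model details, but the standard recursive composition such models build on can be sketched briefly; the class name `TreeComposer` and the single-matrix composition are simplifying assumptions.

```python
import torch
import torch.nn as nn

class TreeComposer(nn.Module):
    """Compose two child phrase vectors into a parent vector.

    Applied bottom-up over a parse tree, the same function builds a vector
    for every phrase, so the sentiment of "not good" can differ from that
    of "good" even though both contain the word "good".
    """

    def __init__(self, dim: int):
        super().__init__()
        self.compose = nn.Linear(2 * dim, dim)

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        # Concatenate the children and project back to the phrase space.
        return torch.tanh(self.compose(torch.cat([left, right], dim=-1)))
```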
Publication date: 2017